Decision Trees

Decision trees are denoted using the <TREE> tag within the object structure. Tree structures are inherently recursive, and so is the XML that represents them. The XML tree structure is perhaps best explained by comparing the Expenses sample above with the way the tree is displayed in Knowledge Builder, as shown in Figure 1.

Figure 1. Expenses Tree

The root node, Grade, has three splits coming off it (Director, Senior Manager, and Junior Manager). In the XML it is represented like this:

        ...
        <NODE ATT="Grade" TYPE="LIST">
          <SPLIT>
            <LISTVAL VALUE="Director"/>
             ...
          </SPLIT>
          <SPLIT>
            <LISTVAL VALUE="Senior_Manager"/>
             ...
          </SPLIT>
          <SPLIT>
            <LISTVAL VALUE="Junior_Manager"/>
             ...
          </SPLIT>
        </NODE>
        ...
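
As an illustration only, the fragment above can be read with Python's standard xml.etree.ElementTree module. This is a minimal sketch, not part of Knowledge Builder: the helper name split_values is hypothetical, and the elided children ("...") are simply omitted, so each split holds only its <LISTVAL> tag.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modelled on the Grade node above. The elided
# content ("...") is left out, so every SPLIT holds only its LISTVAL.
FRAGMENT = """
<NODE ATT="Grade" TYPE="LIST">
  <SPLIT><LISTVAL VALUE="Director"/></SPLIT>
  <SPLIT><LISTVAL VALUE="Senior_Manager"/></SPLIT>
  <SPLIT><LISTVAL VALUE="Junior_Manager"/></SPLIT>
</NODE>
"""


def split_values(node):
    """Return the VALUE of the LISTVAL in each SPLIT directly under node."""
    values = []
    for split in node.findall("SPLIT"):
        listval = split.find("LISTVAL")
        if listval is not None:
            values.append(listval.get("VALUE"))
    return values


node = ET.fromstring(FRAGMENT)
print(node.get("ATT"), split_values(node))
```

Each <SPLIT> is a direct child of the <NODE>, so findall("SPLIT") returns the branches in document order without descending into nested nodes.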

The <NODE> tag has two attributes. The ATT attribute specifies the name of the tied object and the TYPE attribute specifies its type. If the tied object is numeric, an additional attribute, DECS, specifies the number of decimal places. Within the <NODE> structure you can see the three <SPLIT> tags, each of which holds a <LISTVAL> tag that describes what the split is on. The VALUE attribute holds the name of the object value being split on. <LISTVAL> tags apply to List or Boolean tied objects, whereas <NUMVAL> tags apply to Numeric or Date tied objects. Another type of split is an attribute comparator, represented by the <ATTCOMP> tag. Examples of <NUMVAL> and <ATTCOMP> splits, and how they appear in Knowledge Builder, are shown in Figures 2a and 2b.

Figure 2a Example of split

...
<NODE ATT="Cost" TYPE="NUMERIC" DECS="0">
  <SPLIT>
    <NUMVAL VALUE="50" COMP="LT"/>
    ...    
  </SPLIT>
  <SPLIT>
    <NUMVAL VALUE="50" COMP="GE"/>
    ...
  </SPLIT>
</NODE>
...

Figure 2b Example of split

...
<NODE ATT="CaseVoltage" TYPE="LIST">
  <SPLIT>
    <ATTCOMP ATT="CountryVoltage" TYPE="LIST" COMP="EQ"/>
    ...
  </SPLIT>
  <SPLIT>
    <ATTCOMP ATT="CountryVoltage" TYPE="LIST" COMP="NE"/>
    ...
  </SPLIT>
</NODE>
...
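
The difference between the two split types can be sketched in Python as follows. This is only an illustration under stated assumptions: describe_splits is a hypothetical helper, the elided content of each split is omitted, and the test tag (<NUMVAL> or <ATTCOMP>) is assumed to be the first child of its <SPLIT>, as in the figures.

```python
import xml.etree.ElementTree as ET

# Fragments modelled on Figures 2a and 2b, with the elided content omitted.
NUMERIC = """
<NODE ATT="Cost" TYPE="NUMERIC" DECS="0">
  <SPLIT><NUMVAL VALUE="50" COMP="LT"/></SPLIT>
  <SPLIT><NUMVAL VALUE="50" COMP="GE"/></SPLIT>
</NODE>
"""

ATTRIBUTE = """
<NODE ATT="CaseVoltage" TYPE="LIST">
  <SPLIT><ATTCOMP ATT="CountryVoltage" TYPE="LIST" COMP="EQ"/></SPLIT>
  <SPLIT><ATTCOMP ATT="CountryVoltage" TYPE="LIST" COMP="NE"/></SPLIT>
</NODE>
"""


def describe_splits(xml_text):
    """Render each split of a NODE as 'left COMP right'."""
    node = ET.fromstring(xml_text)
    lines = []
    for split in node.findall("SPLIT"):
        test = split[0]  # assumed: the test tag comes first in the SPLIT
        # NUMVAL compares against a literal VALUE, whereas ATTCOMP
        # compares against another object named by its ATT attribute.
        right = test.get("ATT") if test.tag == "ATTCOMP" else test.get("VALUE")
        lines.append(f'{node.get("ATT")} {test.get("COMP")} {right}')
    return lines


print(describe_splits(NUMERIC))
print(describe_splits(ATTRIBUTE))
```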

Both of the examples in Figure 2 introduce the COMP attribute. This is the comparator that defines the threshold test for the split.

The possible comparator codes and their meanings are shown in Table 1.

Comparator   Meaning
LT           Less than
GE           Greater than or equal to
LE           Less than or equal to
GT           Greater than
EQ           Equal to
NE           Not equal to
NONE         No threshold
SUBSET       Subset
SUPERSET     Superset
INTERSECT    Intersection

Table 1. Tree Comparators
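
One way to work with these codes outside Knowledge Builder is to map each one onto a Python test. The sketch below rests on assumptions that the table does not spell out: NONE is taken to mean the split always applies, the three set codes are taken to compare collections of list values, and INTERSECT is taken to mean a non-empty intersection. The names COMPARATORS and split_matches are hypothetical.

```python
import operator

# The six threshold codes map directly onto Python's comparison operators.
# The behaviour given to NONE and to the set codes is an assumption.
COMPARATORS = {
    "LT": operator.lt,
    "GE": operator.ge,
    "LE": operator.le,
    "GT": operator.gt,
    "EQ": operator.eq,
    "NE": operator.ne,
    "NONE": lambda value, threshold: True,  # assumed: no test, always taken
    "SUBSET": lambda value, other: set(value) <= set(other),
    "SUPERSET": lambda value, other: set(value) >= set(other),
    "INTERSECT": lambda value, other: bool(set(value) & set(other)),
}


def split_matches(comp, value, threshold):
    """Return True if value passes the test named by the COMP code."""
    return COMPARATORS[comp](value, threshold)


# A Cost of 49 takes the COMP="LT" branch of a split at 50,
# and a Cost of exactly 50 takes the COMP="GE" branch.
print(split_matches("LT", 49, 50), split_matches("GE", 50, 50))
```

Keeping the codes in a dictionary means an unrecognised COMP value raises a KeyError rather than being silently ignored.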

The next part of a decision tree is its leaf values, or outcomes. The <LEAF> tag represents a leaf and holds the outcome value:

...
<LEAF VALUE="Reject" LEAF_FREQ="4"/>
...

The example above uses the VALUE attribute to set the outcome to "Reject". The LEAF_FREQ attribute can be left out of any ImportXML calls. Empty tree outcomes are represented with a VALUE of "Empty".
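
Because the structure is recursive, the outcomes of a tree can be collected with a short recursive walk. The Python sketch below is an illustration only. It assumes that each <SPLIT> holds its test tag first, followed by either a <LEAF> or a nested <NODE>; the listings elide that content, so this nesting is an assumption. The sample tree, the "Accept" outcome, and the outcomes function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical tree combining the Grade and Cost nodes shown earlier.
# Assumed layout: SPLIT = test tag, then a LEAF or a nested NODE.
TREE = """
<NODE ATT="Grade" TYPE="LIST">
  <SPLIT>
    <LISTVAL VALUE="Director"/>
    <LEAF VALUE="Accept"/>
  </SPLIT>
  <SPLIT>
    <LISTVAL VALUE="Junior_Manager"/>
    <NODE ATT="Cost" TYPE="NUMERIC" DECS="0">
      <SPLIT>
        <NUMVAL VALUE="50" COMP="LT"/>
        <LEAF VALUE="Accept"/>
      </SPLIT>
      <SPLIT>
        <NUMVAL VALUE="50" COMP="GE"/>
        <LEAF VALUE="Reject"/>
      </SPLIT>
    </NODE>
  </SPLIT>
</NODE>
"""


def outcomes(node, path=()):
    """Yield (path, outcome) for every LEAF reachable from node.

    Each path step is (ATT, COMP, VALUE); COMP is None for LISTVAL tests.
    """
    for split in node.findall("SPLIT"):
        test = split[0]
        step = path + ((node.get("ATT"), test.get("COMP"), test.get("VALUE")),)
        for child in split[1:]:
            if child.tag == "LEAF":
                yield step, child.get("VALUE")
            elif child.tag == "NODE":
                # Recurse into the subtree, carrying the path so far.
                yield from outcomes(child, step)


for path, value in outcomes(ET.fromstring(TREE)):
    print(path, "->", value)
```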

Figure 3 shows the Claims decision tree from the Expenses example, which holds two inline nodes.

Figure 3. Claims Tree

This would produce an XML representation like:

        ...
        <INLINE ATT="Cost" TYPE="OBJECT" TIEDTYPE="NUMERIC">
          <INLINE ATT="Calculate_Less_Tax_PRC" TYPE="PROCEDURE">
            <NODE ATT="Services" TYPE="LIST">
              <SPLIT>
                ...
              </SPLIT>
              <SPLIT>
                ...
              </SPLIT>
            </NODE>
          </INLINE>
        </INLINE>
        ...

As you can see, if the type of an inline node is OBJECT then a TIEDTYPE attribute is required to indicate the tied object's type. Otherwise the TYPE can be PROCEDURE, DIALOG, or REPORT to indicate an object of that type.
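
The nesting of inline nodes can be unwrapped with a simple loop. The Python sketch below is an illustration, not Knowledge Builder's own logic. It assumes that each <INLINE> wraps exactly one child (the next inline node or the first real <NODE>), as in the listing above; unwrap_inlines is a hypothetical helper, and the splits of the Services node are omitted.

```python
import xml.etree.ElementTree as ET

# Fragment modelled on the Claims tree above, with the splits omitted.
FRAGMENT = """
<INLINE ATT="Cost" TYPE="OBJECT" TIEDTYPE="NUMERIC">
  <INLINE ATT="Calculate_Less_Tax_PRC" TYPE="PROCEDURE">
    <NODE ATT="Services" TYPE="LIST"/>
  </INLINE>
</INLINE>
"""


def unwrap_inlines(element):
    """Return (steps, inner): the INLINE chain and the element it wraps.

    Each step is (ATT, TYPE, TIEDTYPE); TIEDTYPE is None unless TYPE is
    OBJECT. Assumes every INLINE has at most one child element.
    """
    steps = []
    while element is not None and element.tag == "INLINE":
        steps.append(
            (element.get("ATT"), element.get("TYPE"), element.get("TIEDTYPE"))
        )
        element = element[0] if len(element) else None
    return steps, element


steps, inner = unwrap_inlines(ET.fromstring(FRAGMENT))
print(steps)
print(inner.get("ATT"))
```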